Evolutionary Nearest Neighbour Classification Framework

Authors

  • Amal Perera
  • D. G. Niroshini Dayaratne
  • William Perrizo
Abstract

Data classification attempts to assign a category or class label to an unknown data object based on an available set of similar data whose class labels are already assigned. K nearest neighbour (KNN) is a widely used classification technique in data mining. To classify an unknown object, KNN assigns it the majority class label of its closest neighbours. The computational efficiency and accuracy of KNN depend largely on the techniques used to identify the K nearest neighbours. Selecting a similarity metric to identify the neighbours and selecting the optimum K as the number of neighbours can be treated as an optimization problem. The parameters to optimize for KNN are the value of K, the weight vector, the voting power of the neighbours, attribute selection and instance selection. Finding these values is a search problem with a large search space, and that search space is defined by the application domain. Genetic Algorithms (GAs) are considered well suited to finding near-optimal solutions to search problems with a large search space. Many real-world classification applications can make use of a parameter-optimized KNN, and consequently various studies have used Genetic Algorithms to optimize KNN classification. Despite this research, no software framework is available that can be easily adapted to various application domains. This work aims to build a framework for optimizing KNN classification with the help of a Genetic Algorithm. The developed framework provides a basic backbone for GA optimization of KNN while leaving the user sufficient flexibility to extend it to specific application domains.
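The idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' framework: the function names (`knn_predict`, `loo_accuracy`, `evolve`), the toy data, the leave-one-out fitness, and the GA settings (truncation selection, uniform crossover, Gaussian mutation) are all assumptions chosen for illustration. Only two of the parameters listed above are evolved here, the value of K and the attribute weight vector.

```python
import random
from collections import Counter

def knn_predict(train, query, k, weights):
    """Majority vote among the k nearest rows under a weighted squared
    Euclidean distance. Ties go to the label met first (nearest first)."""
    def dist(x):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, query))
    nearest = sorted(train, key=lambda row: dist(row[0]))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def loo_accuracy(data, k, weights):
    """Leave-one-out accuracy; used here as the GA fitness (an assumption)."""
    hits = 0
    for i, (x, y) in enumerate(data):
        rest = data[:i] + data[i + 1:]
        hits += knn_predict(rest, x, k, weights) == y
    return hits / len(data)

def evolve(data, n_features, max_k=7, pop_size=20, generations=15, seed=0):
    """Hypothetical GA: an individual is a pair (k, attribute weights)."""
    rng = random.Random(seed)

    def random_individual():
        return (rng.randint(1, max_k), [rng.random() for _ in range(n_features)])

    def fitness(ind):
        return loo_accuracy(data, ind[0], ind[1])

    def crossover(a, b):
        # Uniform crossover: each gene is copied from one parent at random.
        return (rng.choice((a[0], b[0])),
                [rng.choice(pair) for pair in zip(a[1], b[1])])

    def mutate(ind):
        k, w = ind
        if rng.random() < 0.2:
            k = min(max_k, max(1, k + rng.choice((-1, 1))))
        w = [min(1.0, max(0.0, wi + rng.gauss(0, 0.1)))
             if rng.random() < 0.2 else wi for wi in w]
        return (k, w)

    population = [random_individual() for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(population, key=fitness, reverse=True)
        parents = ranked[:pop_size // 2]  # truncation selection keeps the best half
        children = [mutate(crossover(rng.choice(parents), rng.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    best = max(population, key=fitness)
    return best, fitness(best)

def make_toy_data(n=40, seed=1):
    """Two classes: feature 0 is informative, feature 1 is pure noise."""
    rng = random.Random(seed)
    return [([i % 2 + rng.gauss(0, 0.3), rng.uniform(-5, 5)], i % 2)
            for i in range(n)]

if __name__ == "__main__":
    data = make_toy_data()
    (k, weights), acc = evolve(data, n_features=2)
    print("k =", k, "weights =", weights, "leave-one-out accuracy =", acc)
```

In a sketch like this the application domain enters only through the data and the fitness function, which is the kind of extension point the abstract attributes to the framework. Voting power, attribute selection and instance selection could be added as further genes in the same way.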

Similar resources

Natural Language Text Classification and Filtering with Trigrams and Evolutionary Nearest Neighbour Classifiers

N-grams offer fast, language-independent multi-class text categorization. Text is reduced in a single pass to n-gram vectors. These are assigned to one of several classes by a) nearest neighbour (KNN) and b) genetic algorithm operating on weights in a nearest neighbour classifier. 91% accuracy is found on binary classification on short multi-author technical English documents. This falls if more cat...

Evolutionary Approach to Overcome Initialization Parameters in Classification Problems

The design of nearest neighbour classifiers depends heavily on some crucial parameters involved in learning, such as the number of prototypes to use, the initial location of these prototypes, and a smoothing parameter. These parameters have to be found by trial and error or by some automatic method. In this work, an evolutionary approach based on Nearest Neighbour Classifier (EN...

A Hyper-Heuristic Classifier for One Dimensional Bin Packing Problems: Improving Classification Accuracy by Attribute Evolution

A hyper-heuristic for the one dimensional bin packing problem is presented that uses an Evolutionary Algorithm (EA) to evolve a set of attributes that characterise a problem instance. The EA evolves divisions of variable quantity and dimension that represent ranges of a bin’s capacity and are used to train a k-nearest neighbour algorithm. Once trained the classifier selects a single determinist...

Spam Classification Using Nearest Neighbour Techniques

Spam mail classification and filtering is a commonly investigated problem, yet there has been little research into the application of nearest neighbour classifiers in this field. This paper examines the possibility of using a nearest neighbour algorithm for simple, word based spam mail classification. This approach is compared to a neural network, and decision-tree along with results published ...

Publication date: 2009